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CONSTRAINED H 1 -REGULARIZATION SCHEMES FOR DIFFEOMORPHIC IMAGE 

REGISTRATION* 

ANDREAS MANG* AND GEORGE BIROS* 


Abstract. We propose regularization schemes for deformable registration and efficient algorithms for their numerical approxi¬ 
mation. We treat image registration as a variational optimal control problem. The deformation map is parametrized by its velocity. 
Tikhonov regularization ensures well-posedness. Our scheme augments standard smoothness regularization operators based on H 1 - 
and H 2 -seminorms with a constraint on the divergence of the velocity field, which resembles variational formulations for Stokes 
incompressible flows. In our formulation, we invert for a stationary velocity field and a mass source map. This allows us to explicitly 
control the compressibility of the deformation map and by that the determinant of the deformation gradient. We also introduce a 
new regularization scheme that allows us to control shear. 

We use a globalized, preconditioned, matrix-free, reduced space (Gauss-)Newton-Krylov scheme for numerical optimization. 
We exploit variable elimination techniques to reduce the number of unknowns of our system; we only iterate on the reduced space 
of the velocity field. Our current implementation is limited to the two-dimensional case. 

The numerical experiments demonstrate that we can control the determinant of the deformation gradient without compromising 
registration quality. This additional control allows us to avoid oversmoothing of the deformation map. We also demonstrate that we 
can promote or penalize shear while controlling the determinant of the deformation gradient. 

Key words, stationary velocity field diffeomorphic registration, constrained regularization schemes, optimal control, variable 
elimination, volume conservation, shear control, inexact Newton-Krylov method. 
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1. Introduction. Image registration is a key technology in computer vision and imaging sciences. Ap¬ 
plications include surveillance, remote sensing, motion tracking, and medical image analysis. Lucid and 
concise expositions on image registration can be found in [65, 67, 85]. The problem of image registration 
can be stated as follows: Given a reference image mg : Q — R and a template image mj : Q — R with 
compact support on Q C R^, d E {2,3}, we seek a plausible map y : Q —>> R d such that the distance 
between m r and mj o y is as small as possible; Q := Q U 90 denotes the closure of Q with boundary 
90, and the operator o is the function composition. If we use an L 2 -distance to measure the proximity 
between m r and mj o y we can formulate image registration as a variational optimization problem 
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Deformable registration is an ill-posed, nonlinear, and nonconvex optimization problem—regularization 
is inevitable. The key idea of regularization is to stably compute a solution to a nearby problem. A variety 
of regularization schemes have been proposed, for example, [16, 17, 22, 24, 23, 29, 32, 33, 34, 44, 45, 60, 62]. 
Regularization is typically based on some Tikhonov functional that is added to the objective, which—in 
the case of (1) —is a quadratic norm, the contribution of which is controlled by the weight f> > 0. The 
particular choice of the regularization model depends on the application. This is also true for the measure 
of the proximity between mg and mj o y; different choices can be found in [65, 67, 85]. 

A key requirement in many applications, especially in medical imaging, is that the map y is a dif- 
feomorphism [10, 17, 30, 89, 91], i.e., y is a bijection, continuously differentiable, and has a continuously 
differentiable inverse. Formally, we require that det(Vi/) 7 ^ 0 for every x E O, where Vy E R dxd is the 
Jacobian of the deformation map y. Under the assumption that y is orientation preserving we require 
that det(Vy) > 0 for every x E O. In practice, we would like to control the distance of det(Vy) from 
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zero. 1 Generally speaking, the type and weight of regularization are selected to drive the optimizer to 
diffeomorphic maps y at reasonable computational cost while enabling a good registration between tur 
and mj. In the framework of large deformation diffeomorphic image registration we do not directly 
invert for the deformation map y but for its velocity v. Broadly speaking, we can distinguish between 
approaches that invert for stationary [3, 4, 47, 59, 58, 91] and those that invert for nonstationary veloc¬ 
ity fields [10, 22, 30, 89]. The proposed formulation uses a stationary velocity, although in principle the 
extension to a nonstationary velocity is straightforward. In either case, the search space for y is typ¬ 
ically restricted to the manifold of diffeomorphisms by specifying adequate smoothness requirements 
for v [5, 10, 30, 47, 89]. However, this smoothness control may result in oversmoothing or may lead to 
det(Vi/) ^ 0 or even det(Vi/) < 0 [5]. 

Here, we propose constrained regularization schemes for v that allow us to control det(Vi/) and the 
amount of shear in the deformation map y. We follow up on [60] where we introduced numerical schemes 
for our optimal control based large deformation diffeomorphic image registration formulation (for both 
stationary and nonstationary velocity fields). In particular, we considered two models—one for compress¬ 
ible and one for incompressible diffeomorphisms y. 2 For the incompressible case we hypothesized that 
fixing det(Vi/) to one yields more well-behaved mappings as compared to plain smoothness regulariza¬ 
tion. We found that enforcing det(Vi/) = 1 up to numerical accuracy seems to be a too strong constraint 
for our formulation to be applicable across a wide range of registration problems. We also found that 
enforcing det(Vi/) = 1 can lead to excessive shear in the deformation map. 

In the present work, we propose new regularization schemes to address these issues. We introduce 
a mass source w : Q -A R as an additional unknown to our variational optimization problem. Concep¬ 
tually, this is equivalent to replacing the incompressibility constraint by a soft constraint (penalty) on the 
divergence of v (see, e.g., [14]). Our formulation avoids ill-conditioning issues in case we set V • v to a 
specified value (e.g., zero). We refer to this scheme as linear Stokes regularization. Our hypothesis is that the 
obtained maps y are better behaved (smaller variations of the determinant of the deformation gradient) 
without compromising registration quality as compared to plain smoothness regularization, e.g., used 
in [5, 6, 7, 10, 30, 46, 89, 87, 91, 92]. A similar reasoning can be found in connection with hyperelastic 
regularization models [17, 29]. 

Our overarching goal is to design a biophysically constrained framework for large deformation dif¬ 
feomorphic image registration. Constraints can range from complicated biophysical priors, such as brain 
tumor models [35, 36, 63, 49] or cardiac motion models [86], to—like in the present case—simpler models 
of (nearly) incompressible tissue. The general idea is to favor diffeomorphic deformation maps that have 
minimal volume changes without compromising data fidelity. An interesting application for incompress¬ 
ible diffeomorphisms is motion estimation in cardiac imaging [13, 37, 64, 86]. Here, it is expected (at least 
for healthy individuals) that the volume of the heart muscle does not vary significantly during a cardiac 
cycle; the deformation map is incompressible. Other applications for (near-)incompressible diffeomorphic 
registration include time series of abdominal images of a single individual, i.e., images of the liver or the 
kidneys. Here, we also expect the tissue to mostly behave like an incompressible material. Notice that 
our new formulation relaxes the incompressibility constraint—it is possible to compute deformation maps 
that have large local volume changes; we will demonstrate this experimentally. 

We, in addition to that, introduce a new regularization scheme that allows us to promote or penalize 
shear. This formulation also operates in a near-incompressible regime and is motivated from continuum 
mechanics [73, 75]. In some registration problems the optimizer might drive us to maps that introduce 
excessive shear. Our new formulation allows us to penalize shear in order to generate maps that are 
well behaved, guaranteed to be diffeomorphic, and potentially (near-)incompressible. On the contrary, we 
can also promote shear using the same formulation by simply changing the value of a single parameter. 
This may be of interest in applications where we expect sharp interfaces (large shear) in the deformation 
map. We refer to this scheme as nonlinear Stokes regularization. We will see that this formulation is—in the 
limit—equivalent to total variation regularization [20, 23, 34]. 


1 Monitoring det(Vi/) does not guarantee that volume elements do not collapse [17, 42, 60]. In practice, we have to monitor 
geometric properties of the deformed grid cells. 

2 Related work on incompressible diffeomorphisms can be found in [19, 20, 48, 64, 80, 81]; see §1.4. 
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1.1. Outline of the Method. We introduce a pseudo time variable t > 0 and solve for a stationary 
velocity field v € V, V C L 2 (Ci) d , and a mass source w e W, W C L 2 (Cl), as follows: 
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and periodic boundary conditions on 9Q. The state variable m : Ox [0,1] -A R in (2b) models the 
transported intensities of the template image mj : Q -A R subjected to the stationary velocity field v. 
The deformation map y is not computed explicitly. 3 Instead, the solution of (2b), i.e. m\ :s=s ra( •, t = 1), 
mi : Q -A R, represents rap o y in (1), where y represents an Eulerian deformation map. We, like in (1), use 
an L 2 -distance to measure the proximity between m\ and rap. The objective in (2a) additionally consists 
of two regularization models that act on the controls v and w with weights fi v and $ w , respectively. We 
provide more details on the choice for the associated norms and the choices for q > 0 in §2. 

We augment the regularization on v by a constraint on the divergence of the control v in (2d). Setting 
w in (2d) to zero yields a model of incompressible flow [19, 20, 48, 60, 64, 80]. This is equivalent to 
enforcing det(Vy) = 1 up to numerical errors (see [39, p. 70ff.]). We relax this incompressibility constraint 
by introducing an unknown mass source w, which is determined by solving (2). 

In §3 we will see that the optimality system for (2) is a system of space-time nonlinear multicomponent 
PDEs for the transported intensities m, the velocity field v, the mass source w, and the adjoint variables 
for the transport and divergence condition. Solving this system poses significant challenges. We follow 
our former work [60] and solve for the first-order optimality conditions using a globalized, matrix-free, 
preconditioned inexact (Gauss-)Newton-Krylov method for the Schur complement of the velocity v. We 
first derive the optimality conditions and then discretize using a pseudospectral discretization in space 
with a Fourier basis. 


1.2. Contributions. The main goal of this work is to introduce and put to the test new regulariza¬ 
tion schemes for large deformation diffeomorphic image registration. We use the solver and numerical 
techniques we have described in [60] to efficiently solve the associated optimization problem. Our main 
contribution is the formulation and the derivation of the systems. In particular, the contributions are as 
follows: 

• We extend existing work on continuum mechanical models for incompressible flow [14, 19, 20, 
48, 60, 64, 80, 81] by introducing a mass source w into the variational optimization problem. This 
results in a formulation that is more flexible in that we do not fix det(Vy) to one. 

• We propose a novel H 1 -regularization scheme that yields a continuum mechanical model with a 
viscosity that depends on the strain rate tensor (non-Newtonian fluid). This allows us to explicitly 
control the resistance of the fluid to shear stress and by that promote (shear thinning fluid) or 
suppress (shear thickening fluid) large shear in the map y. We will see that this model has a 
strong resemblance of total variation regularization [20, 34]. 

• By using Lagrange multipliers to control the divergence, our formulation avoids ill-conditioning 
issues in case V • v is set to a specified value (for example, w = 0). Our numerical discretization 
(pseudospectral) allows us to construct fast solvers for the optimality system; we only iterate on 
the reduced space of the velocity field v. 

• We provide second order information for numerical optimization. 4 Although second order meth¬ 
ods have widely been used in traditional, variational registration approaches (see, e.g., [67]), there 


3 The Eulerian deformation map y and the deformation gradient F\ := (Vi/) -1 are computed from v (see §D.2). 

4 Our globalized, matrix-free, preconditioned Newton-Krylov scheme has been described in [60]. We do not view the numerical 
scheme as a major contribution but the formulation and the derivation of the associated optimality conditions. A study of our 
numerical scheme, which includes a comparison to a preconditioned gradient descent algorithm (i.e., our algorithm uses the reduced 
gradient in the Sobolev space V), can be found in [60]. 
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has been little work on the use of Newton-type optimization in the framework of large deforma¬ 
tion diffeomorphic image registration [5, 46, 60]. Most work in this area is still based on first order 
numerical optimization strategies [6, 7, 14,19, 20, 43, 47, 54, 55, 92]. 

• We study the effect of incompressibility and smoothness regularization on the overall registration 
quality as a function of the regularization parameters. We quantify registration accuracy in terms 
of overlap measures and compare our results against the diffeomorphic DEMONS algorithm [91]. 
We demonstrate that, by introducing a constraint on V • v, we can control det(Vi/) without com¬ 
promising registration quality. We show that our model allows us to avoid oversmoothing of the 
deformation map. We also study the effect of controlling shear. 

1.3. Limitations and Unresolved Issues. Here, we summarize the limitations and unresolved issues 
of our work: 

• We introduce an additional regularization parameter. This makes it more difficult to design a 
black-box solver and more expensive to automatically calibrate the algorithm. 

• We assume similar intensity statistics of tur and mj. This is a common assumption in many 
image registration approaches [10, 19, 34, 43, 47, 54, 68, 92]. For multimodal registration problems 
different distance measures have to be considered (see e.g., [65, 85]). 

• We present results only in two dimensions. Nothing in our formulation and numerical approx¬ 
imation is specific to the two-dimensional case. Overall, the method is very expensive and a 
practical three-dimensional implementation requires more work. 

• We only report results for stationary velocities (see, e.g., [4, 47, 91]). We have implemented and 
tested time-varying velocities (in [60] we report results for incompressible y). For a two-image 
registration problem we found that a velocity that changes in time does not improve the quality 
of the registration. For tracking problems like optical flow [14, 19, 20, 50, 52, 80] or time series 
of medical images [61] a time-dependent velocity may be necessary. Nothing changes in our 
formulation, just the problem size (see [60]). 

1.4. Related Work. There is a vast body of literature on image registration. Here, we restrict the 
discussion to approaches that are closely related to our work. We refer to [65, 67, 85] for a more general 
overview on algorithmic developments and formulations. 

Our approach shares numerous characteristics with methods that have appeared in the past. Optimal 
control formulations for image registration have been discussed in [14, 19, 43, 54, 55, 60, 92]. Our work is 
related to large deformation diffeomorphic metric mapping [6, 7, 10, 30, 89, 95] (see [60] for a connection). 
It differs in that we, like [3, 4, 47, 59, 58, 91], invert for a stationary velocity field. Our formulation also 
shares conceptual ideas with traditional optical flow formulations [50, 52, 80]. We refer to [60] for a more 
detailed discussion. In this review we will focus on approaches that (z) introduce mass conservation as an 
additional constraint and/or (zz) aim at recovering motion fields that locally contain significant shear. 

One way to explicitly control det(Vi/) is to set it to one. This is equivalent to working with incom¬ 
pressible velocity fields (see [39, p. 77ff.]); we refer to this model as linear Stokes regularization. Formulations 
based on divergence-free velocity fields have been described in [14, 19, 48, 60, 64, 80, 81]. None of these 
consider an inversion for a mass source w. All of these, with the exception of our preceding work [60], 
use first order information only for numerical optimization. Other formulations for controlling det(Vi/) 
can be found in [1, 8, 17, 40, 41, 42, 56, 57, 66, 69, 74, 77, 79, 83, 94]. 

We, in addition to that, introduce a continuum mechanical model that controls shear (either promoting 
or penalizing it). We refer to this formulation as nonlinear Stokes regularization. Related approaches based 
on nonquadratic regularization norms (L 1 -norm or total variation) have been described in [20, 23, 34, 78, 
96]. In our formulation, the regularization is a function of the shear strain rate. This allows us to explicitly 
control the amount of shear in the deformation map. Whether such an explicit control is beneficial in 
certain applications remains to be seen. We will see that our formulation is in the limit equivalent to total 
variation regularization [20, 23, 34]. Our model couples the individual components of the regularized 
vector field as opposed to component wise vectorial total variation regularization [20, 34]. 

Other approaches for estimating an expected discontinuous motion field include locally adaptive (i.e., 
direction-dependent and/or intensity-driven) regularization [71, 72, 82], are based on a decomposition of 
the bodyforce [8], or are based on a subdivision of the domain [76, 79, 93]. The formulations in [23, 24, 
34, 78, 82, 71] operate on the deformation map or the displacement field. Our formulation operates on the 
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velocity field instead and as such falls into the category of large deformation models. We can, like in the 
linear Stokes case, control the magnitude of det(Vi/). All mentioned approaches for estimating sliding 
motion, with the exception of [ 8 , 78, 79], do not feature such a control. Our formulation—as opposed 
to [ 8 , 71, 76, 79, 82, 93] —does not require any partitioning of the domain (presegmentation). We exploit 
the second order Newton-Krylov scheme we have introduced in [60], which further distinguishes us from 
most of the preceding work. Our formulation allows us to control the shear within the estimated motion 
field on the basis of a single, strictly positive parameter; we can not only promote shear but also penalize 
it. 


1.5. Organization and Notation. We provide additional details on the optimal control formulation 
in §2. The optimality system is summarized in §3. The numerics are described in §4 (we refer to [60], 
where we originally introduced our numerical scheme, for more details). We report experiments in §5 and 
conclude with § 6 . Additional derivations, comments, algorithmic details, and measures of registration 
performance can be found in the appendix. 

An overview of the commonly used symbols can be found in Tab. 1. Vectorial quantities and matrices 
are denoted in boldface. Function spaces, differential operators, and functionals are denoted in calligraphy. 
A superscript h is added to the variables whenever we refer to discretized quantities. 

2. Problem Formulation. The images to be registered are modeled as compactly supported functions 
on the domain Q := ( — TC,n) d C R d , d £ {2,3}, with boundary 9Q, and closure Q := Q U 9Q. We 
introduce a pseudo time variable t > 0 and solve for a stationary velocity field v £ V, V C L 2 (Q) rf , and a 
mass source w £ W, W C L 2 (fi), as follows: 


(3a) 
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and periodic boundary conditions on 9Q. We do not directly invert for the map y but for its velocity 
v. This is different from the problem formulation in (1); in our formulation, the solution of (3b) — m\ := 
m(-,t = 1), m\ : Q -a R —is equivalent to mj o y in (1), where y is the deformation map defined in 
an Eulerian frame of reference. We, as in (1), use an L 2 -distance to measure the proximity between the 
deformed template image m\ and the reference image thr. This is a common choice in many deformable 
image registration algorithms (see, e.g., [10, 54, 68 , 92]). The parameters > 0 and > 0 control the 
contribution of the regularization norms. The parameter 7 £ {0,1} is introduced for clarity. If we set 
7 = 0 we obtain a formulation that is related to available models for large deformation diffeomorphic 
image registration [4, 10, 43, 47, 92, 91] (see [43, 60] for a more detailed insight ). 5 If we set 7 to one 
and w to zero, we obtain a model for incompressible diffeomorphisms; i.e., we enforce det(Fi) = 1 up 
to numerical accuracy; the tensor field : Q -A R dxd is the deformation gradient at t = 1 computed 
from v (see §D.2 for details). Similar approaches for incompressible diffeomorphisms have been described 
in [19, 48, 60, 64, 80, 81]. We extend these by introducing a nonzero mass source w. This allows us to relax 
the model from incompressible diffeomorphisms to a model of near-incompressible diffeomorphisms. The 
regularization on w in (3a) acts like a penalty on V • v; we use an H 1 -norm: 


(4) 
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= / Vw • Vw + w 2 dx. 


5 We invert for a stationary velocity field. Other work on diffeomorphic registration based on stationary velocity fields can, for 
instance, be found in [3, 4, 47, 59, 58, 91]. The traditional large deformation diffeomorphic metric mapping framework introduced 
in [10, 30, 88, 89, 95] uses time-dependent velocity fields. Our implementation allows us to invert for time-dependent and stationary 
velocity fields alike (in case quadratic regularization models are considered); nothing changes in our formulation, just the number 
of unknowns (see [60] for details). Here, we limit ourselves to stationary v. 
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Table 1 

Commonly used notation and symbols. 


Symbol/Notation 

Description 

CFL 

Courant-Friedrichs-Lewy (condition) 

DSC 

Dice similarity coefficient 

eft 

fast Fourier transform 

FNE 

false negative error 

FPE 

false positive error 

JSC 

Jaccard similarity coefficient 

KKT 

Karush-Kuhn-Tucker 

matvec 

matrix-vector product 

PDE 

partial differential equation 

PDE solve 

solution of the hyperbolic transport equations 

PCG 

preconditioned conjugate gradient (method) 

RK2 

2nd order Runge-Kutta (method) 

d 

spatial dimensionality; typically d £ {2,3} 

O 

spatial domain; O := (— n, n) d C R d with boundary 30 and closure O := O U 30 

X 

spatial coordinate; x := (x 1 ,..., x d ) T £ R d 

Mr 

reference image; mR : O -A R 

nij 

template image; mj : O -A R 

m 

state variable (transported intensities); m : O x [0,1] -A R 

mi 

deformed template image (state variable att = l);rai:0— 

A 

adjoint variable (transport equation); A : O x [0,1] —> R 

V 

adjoint variable (incompressibility constraint); p : O -A R 

V 

control variable (stationary velocity field); v : O -A R d 

w 

control variable (mass source); w : O -A R 

b 

bodyforce; b : O -A R d 

n 

(reduced) Hessian 

g 

(reduced) gradient 

F i 

deformation gradient at t = 1; Fi : O -A R dxd ; = F(-,t = 1), = (Vi/) _1 

fa 

regularization parameter for the control v 

f>w 

regularization parameter for the control w 

V 

Glen's flow law exponent; v > 0 

S 

strain rate tensor; £[v\ := j((Vu) + (Vu) T ) 

A 

regularization operator (variation of regularization model acting on v) 

V 

gradient operator (acts on scalar and vector fields) 

A 

Laplacian operator (acts on scalar and vector fields) 

V- 

divergence operator (acts on vector and 2nd order tensor fields) 


We use H 1 - and H 2 -seminorms to regularize v; in particular. 


( 5 ) 


1 ^( 0 ) 


:= J \7v :\7vdx and M^2(n) := J Az^Audx, 


respectively. The choice of an H 2 -seminorm is motivated by related work on large deformation diffeomor- 
phic image registration (see, e.g., [10, 43, 92]). The choice of an H 1 -seminorm is motivated by the fact that 
we obtain optimality conditions that are similar to Stokes equations in fluid mechanics; we refer to this 
(near-)incompressible formulation as "linear Stokes regularization". 

Since we observed that a model of incompressible flow may promote shear, we additionally introduce 
a nonlinear regularization model that allows us to control (promote or penalize) shear in the deformation 
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field in a problem-dependent way. This model is motivated from continuum mechanics 6 and given by 

(6) Wgi'S 72 '' = ^p[f n Q(( V ^) + ( Vp ) T ) : + (Vf) T )J 1+V /2V dx. 

Here, v > 0 controls the nonlinearity and + (Vz?) T ) =: £[v] is the strain rate tensor. We will see 

that we arrive at a Stokes-like optimality system with a viscosity that depends on the strain rate (see §3 
for details). For v £ (0,1) we obtain a shear thickening and for v > 1 a shearthinning fluid. Notice that we 
approach a total variation like regularization model as v in (6) tends to oo (see §C). Thus, we can explicitly 
control the shear within y via v. This model, in combination with the incompressibility constraint, yields 
a deformation map for which det(Fi) = 1. This is a fundamental difference to most existing models 
for estimating sliding motion 7 (with the exception of [8, 78, 79]), since these in general do not explicitly 
control the determinant of the deformation gradient. We have also tested a version of this model with 
a relaxed incompressibility constraint. We refer to this formulation as " nonlinear Stokes regularization ". 
Notice, that the derivation we describe in the present work will also hold if we replace (6) with a total 
variation regularization model. 

Remark on some theoretical considerations: There are several questions at hand. A first question 
regards an appropriate choice of the space for v so that the transport equation (3b) has a unique solution 
and preserves the smoothness of the initial image ra(-, 0) = mj. The answer to this question depends 
on smoothness of the input images mj and m r. A second question regards the sufficient regularity 
of the adjoint variables, which is required to justify a gradient-descent scheme for solving (3). Finally, 
a third question regards the existence and uniqueness of the solution of (3). If the input images mj 
and are adequately smooth, and the regularization space and weights are large enough, then the 
transport equation has a unique solution, smoothness is preserved, the adjoint variables are smooth, and 
the problem has a solution. Uniqueness requires an even stronger regularization to ensure the convexity 
of the problem. In our experiments we always use smooth images (we apply a Gaussian smoothing 
operator to the input images). However, it is not known if our regularization scheme for the velocity 
is in the theoretical limit sufficient for all three questions to have an affirmative answer. Numerically, 
we control the velocity by adjusting the regularization weight to ensure we obtain diffeomorphic maps. 
Informally, our experimental studies suggest that H 1 -regularity of v and the penalization of V • v can 
provide sufficient smoothness as long as the regularization parameters are sufficiently large. We provide 
a lengthier discussion in §B, based on work of other authors [18, 19, 26, 28]. A detailed theoretical analysis 
is beyond the scope of this work and remains open for future work. 

3. Optimality Conditions. We use the method of Lagrange multipliers to solve (3). The Lagrangian 
reads 

(7) C[<p\ :=J[v,w] + j (d t m + Vm • v,\) l2{0) dt + (m - m T ,p ) L2(n) - (V • v - w,p) L 2(n) 

with Lagrange multipliers A:Qx[0,l]4R for the hyperbolic transport equation (3b), y : Q -A R for the 
initial condition (3c), and p : Q —x R for the incompressibility constraint (3d) (p is referred to as pressure in 
fluid dynamics); cp := (m, v, w, A, y, p) and (•, -) L 2 (q) denotes the standard L 2 inner product defined on Q. 

Our algorithm falls into the class of reduced space Newton-Krylov methods [11,12]. This will also be 
reflected by the optimality systems we present below. The interested reader is referred to §E for a more 
detailed explanation of the optimality systems, the conceptual ideas behind our algorithm, and details for 
its implementation. We describe our Newton-Krylov algorithm in more detail in [60]. We refer to [15, 38] 
for general information on optimal control theory and PDE constrained optimization; the conceptual ideas 
we use for our optimization scheme are described in [70]. We use an optimize-then-discretize approach. 8 
The resulting optimality conditions is what we discuss next. 

6 We will arrive at a formulation that resembles continuum mechanical models that can be found in geoscience applications [51, 
73, 75]. 

7 We note that our formulation allows us to only approximate discontinuous motion fields (sliding motion) by promoting shear; 
the computed map will still be continuous. 

8 For a discussion on advantages and disadvantages of the optimize-then-discretize and the discretize-then-optimize approach 
we refer to [38]. 
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3.1. First order Optimality System. From Lagrange multiplier theory we know that the variations of 
C with respect to all variables have to vanish for an admissible solution of (3). Taking variations of C in (7) 
with respect to m, v, w, A, y, and p, in directions in, v, w, A, y, and p, and applying integration by parts, 
yields the optimality system (i.e., the first order necessary optimality conditions (KKT conditions) in strong 
form) 


( 8 a) 

dtm + Vra • v = 0 

in Ox (0,1], 

( 8 b) 

a 

1 

a 

H 

II 

O 

in Q x {0}, 

( 8 c) 

— dt A — V • (vX) = 0 

in Ox [0,1), 

( 8 d) 

A + {m — mp) = 0 

in Ox {1}, 

( 8 e) 

< 

1 

II 

0 

in Q, 

( 8 f) 

g-o := PvA[v] + 7 Vp + b = 0 

in Q, 

( 8 g) 

gw := tGM-Am; + w) + p) = 0 

in Q, 


subject to periodic boundary conditions on 9Q. The parameter 7 E {0,1} enables or disables the constraint 
on the divergence of v in ( 8 e). Further, 


b := / AVradf 
Jo 

is the body force. The operator — A + id (where id is the identity operator) in ( 8 g) is the first variation of the 
H 1 -norm in (4). The operator A in ( 8 f) is the first variation of the regularization model for v. In particular, 
we have 

(9) A[v] = —Av and A[v\ = A 2 v 
for the H 1 - and the H 2 -seminorm in (5), respectively . 9 Further, we have 

( 10 ) A[v] = —V • 2rj[v\£[v\ 

if we consider the regularization model in ( 6 ), where rj[v] := tr(£[v]£ [u])( 1_v ) //2v is the effective viscosity, 
v > 0 is Glen's flow law exponent, and £[v] := ^((Vu) + (Vu) T ) is the strain rate tensor. 

In the language of optimal control ( 8 a) is referred to as the state equation (with initial condition ( 8 b)), 
( 8 c) as the adjoint equation (with final condition ( 8 d)), and ( 8 f) and ( 8 g) as the control equations, respectively. 
Notice that the adjoint equation models the transport of the mismatch between m\ (deformed template 
image) and m r backward in time; A will (ideally) tend to zero if we approach the solution of our problem. 

We can directly use the optimality system in ( 8 ) to design an iterative scheme for computing a solu¬ 
tion to (3). This will result in a first order gradient descent scheme, which is still widely used in large 
deformation diffeomorphic image registration [10, 43, 92] despite its linear convergence. As we have seen 
in [60] (for the compressible and the incompressible case) preconditioned gradient descent schemes 10 are 
inferior to preconditioned Newton-Krylov schemes in case we strive for high inversion accuracy. How¬ 
ever, exploiting second order information requires more work; we have to derive the second variations of 
the Lagrangian. This is what we present next. 

3.2. Newton Step. We use a globalized, inexact, reduced space (Gauss-)Newton-Krylov method for 
numerical optimization (see §4); we solve ( 8 ) using a Newton linearization. We have to compute variations 
of the weak form of the optimality conditions in ( 8 ), i.e., second variations of the Lagrangian in (7), to 
obtain the associated KKT system. Following the standard theory of calculus of variations, invoking the 
appropriate Green's identities (integration by parts), and adhering to the fact that we consider a reduced 


9 For A = — A, 7 = 1 , and w = 0 we obtain a linear Stokes regularization model (i.e., a model for incompressible diffeomor- 
phisms; see [60]). 

10 The control equation provides the reduced L 2 gradient (variation of the objective with respect to u); by preconditioned gradient 
descent we mean that we use the gradient in the Sobolev space V for numerical optimization. 
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space method, we arrive at the following system (which corresponds to the strong form of the second 
variations of the Lagrangian in (7)): 


(11a) 

dtm + Vm • v + Vra • v = 0 

in Q x (0,1], 

(lib) 

m = 0 

in Q x {0}, 

(11c) 

-d t A - V • (kv) - V • (Av) = 0 

in Q x [0,1), 

(lid) 

A + m = 0 

in Q x {1}, 

(lie) 

< 

1 

II 

o 

in Q, 

(Ilf) 

PvB[v\ +?Vp + h = -g v 

in Q, 

(Hg) 

y(p w (-A w + w) + p) = -g w 

in Q, 


with periodic boundary conditions on 9Q and incremental body force 

b= [ AVra + AVradf. 

Jo 

We refer to (11a) (with initial condition (lib)), (11c) (with final condition (lid)), (Ilf) and (llg) as incre¬ 
mental state, incremental adjoint , and incremental control equations , respectively. The incremental variables 
are denoted with a tilde. The incremental control equations (llg) and (Ilf) represent the action of the 
reduced space Hessian operators on the control variables (Hessian matvec). The right-hand sides in (lie) 
and (llg) correspond to the reduced gradients in (8g) and (8f), respectively. 

The operator B is the second variation of the regularization model acting on v. It coincides with the 
first variations in (9) for the quadratic regularization models in (5). This also holds for the second variation 
of the H 1 -norm in (4) (see (llg)). The second variation for the nonlinear regularization model in (6) does 
not coincide with its first variation; we obtain 


( 12 ) 




—V • 2 rj[v\ 



1 — v £[v] ® £[v] 
v £[v] : £[v] 

s -V-' 

= :QM 


m 


instead, where ® is the tensor outer product and X is a fourth order identity tensor. 

We can significantly simplify these systems by exploiting variable elimination techniques. This allows 
us to merely iterate on the reduced space of the velocity field v. This is what we discuss next. 

3.3. Reduced Systems. We eliminate the control and adjoint variables w and p, and by that the con¬ 
straint on the divergence of the velocity field v from the optimality system (8). The systems we provide 
below are the ones we solve numerically (see §E). We provide details on their derivation in §A. We arrive 
at 


(13a) 

dtm + Vm • v = 0 

in Q x (0,1], 

(13b) 

3 

1 

3 

H 

II 

O 

in Q x {0}, 

(13c) 

-a f A - V • (Aw) = 0 

in Q x [0,1), 

(13d) 

A + (m — tur) = 0 

in Q x {1}, 

(13e) 

g := p v A[v] + /C[bj = 0 

in Q, 


with periodic boundary conditions on 9Q to replace (8). The operator A corresponds to the first variation 
of the regularization models. The operator JC projects v onto the space of near-incompressible velocity 
fields. If we consider the regularization models in (5) we have 

(14) JC[b] = - VAf'A-'V'b + i, where M — + id)) -1 + id, 

and id is the identity operator. If we set w = 0 this operator simplifies to M = id. If we consider (6) 
instead, we have 


K.[b,v] = VAT 1 A _1 V • (V • 2 faf)[v]£[v] -b) + b. 
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where M = 2fi v fj[v\ (J3 W (—A + id)) -1 + id. If we set w = 0 we obtain M = id. 

The system no longer depends on w and p. This allows us to efficiently solve (3); we only iterate on 
the reduced space of the velocity field v. Computing variations of the weak form of (13) yields the Newton 


step 



(15a) 

3 tfh + Vm • v + Vra • v = 0 

in Q x (0,1], 

(15b) 

m = 0 

in Q x {0}, 

(15c) 

—3*A - V • (Xv) - V • (Av) = 0 

in Q x [0,1), 

(15d) 

A + m = 0 

in Q x {1}, 

(15e) 

nv := p v B[v] + C[b} = -g 

in Q, 


with periodic boundary conditions on 3Q. Here, g in (15e) corresponds to the reduced gradient in (13e). 
The operator B is the second variation of the regularization model (see §3). The projection operator C 
coincides with K in (14) if we consider the seminorms in (5) as a regularization operator. If we consider (6) 
instead, we have 

(16) jC(v)[b,v\ = VA1 _1 A -1 V • (V • 2fi v [fj[v] +rj[v]Q[z ?]) £[v\ — b) + b, 

where the operator M is as defined above and the operator Q is given in (12). Our algorithm only operates 
on these reduced systems. We discuss its implementation, and by that the scheme to ultimately solve (3), 
next. 

4. Numerics. Our numerical scheme was originally described in [60]. We normalize the intensities 
of the images to [0,1]. We use the trapezoidal rule for numerical quadrature and an explicit second 
order Runge-Kutta method for the numerical time integration of the hyperbolic PDEs in (13) and (15), 
respectively. Due to the conditional stability (CFL condition) we have to restrict the time step size hf. Given 
that we invert for a stationary velocity field v, we can modify the number of time steps nt as required. We 
use a pseudospectral discretization with a Fourier basis in space. This allows us to efficiently construct 
the inverse differential operators that appear in our formulation in §3.3. 

Images are in general functions of bounded variation; our scheme cannot handle this type of dis¬ 
continuity in the data. Accordingly, we assume that the images are adequately smooth. We ensure this 
numerically by pre-smoothing the data, a common strategy considered in many registration packages. 11 
We use a globalized, inexact [27, 31], preconditioned, matrix-free, reduced space (Gauss-)Newton-Krylov 
method for numerical optimization [60]. This scheme amounts to a sequential solution of the systems (13) 
and (15). The Newton step is in a general format given by 

(17) M\v\ - -g\, v h k+l = ^ + aitWfc, k = 1, 2,... 

where T~L\ E R nxn , n E N, is a discrete representation of the reduced Hessian in (15e) acting on the 
incremental control variable v\ E R n at (outer) iteration k. The scheme is globalized via a backtracking 
line search subject to the Armijo condition (we use default parameters; see [70, algorithm 3.1, p. 37]). We 
iteratively solve (17) using a PCG method. We refer to the solution of (17) as inner iterations (as opposed 
to the steps for updating to which we refer to as outer iterations). We ensure that the reduced space 
Hessian operator is positive definite by exploiting a Gauss-Newton approximation to the true Hessian. 
This corresponds to setting A in (15) to zero (see also [60]). The preconditioner for the reduced space 
Hessian is the inverse of the second variation of the regularization operator. This preconditioner has 
vanishing construction costs, due to the pseudospectral discretization in space. 

We provide more details on this algorithm in the appendix §E; we also refer to [60] for a detailed 
algorithmic study of our Newton-Krylov scheme in the context of large deformation diffeomorphic image 
registration; this includes a comparison to a preconditioned gradient descent scheme. 

11 An inadequate smoothness of the data can deteriorate the accuracy of the solver. We numerically ensure stability by enforcing 
sufficient regularity of the data and the velocity field v based on a combination of pre-smoothing of the input data and an adequate 
choice of the regularization weight The adjoint equation plays a critical role; in our formulation, we have to differentiate the 
Lagrange multiplier. As pointed out in [92], we can ensure numerical feasibility if we ensure that the input images are adequately 
smooth. Further strategies include the use of another scheme for numerical time integration or to use a map-based formulation [43]. 
We will investigate this in the future. 
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Fig. 1. Real-world two-dimensional registration problems. We display (from left to right) the reference image mR (fixed image), the 
template image mj (image to be registered), and a map of the residual differences between mR and mj before registration (for each set of images as 
indicated by the inset). Top left: benchmark registration problem [2, 65, 67]; top right: intersubject registration problem; bottom left: longitudinal 
(intrasubject) registration problem; bottom right: intersubject registration problem [21] (for the latter data we have a ground truth based on 
annotations: segmentations of 32 anatomical gray matter regions of interest; we overlay the associated label maps onto the reference and template 
image). 


5. Numerical Experiments. We study the performance of the proposed formulation in different ap¬ 
plication scenarios, accounting for synthetic and real-world registration problems. All results reported in 
this study are limited to the two-dimensional case. Nothing in our formulation is specific to d = 2; a 
three-dimensional implementation is ongoing work. 

We limit the first part of this study in §5.1 to the quadratic regularization norms in (5). Results for the 
nonlinear regularization model in (6) are reported in §5.2. 

5.1. Quadratic Regularization. We report different measures of registration performance, with the 
aim to assess both, the fidelity of the registration as well as properties of the computed deformation map. 
We report results for different two-dimensional real-world registration problems. 

5.1.1. Registration Performance. Purpose : We study registration quality as a function of the regular¬ 
ization parameters (smoothness) and [5 W (incompressiblity). 

Setup : All images are registered in full resolution. No grid, scale, or parameter continuation is per¬ 
formed. We terminate the optimization if the relative reduction of the gradient is at least three orders 
of magnitude. We consider three two-dimensional, real-world registration problems: a benchmark prob¬ 
lem based on hand images 12 [2, 65, 67] as well as a multisubject and a serial 13 (longitudinal) brain image 
registration problem ( multisubject brain images and serial brain images). The initial images are displayed 
in Fig. 1. 

The images have a grid size of 256 x 256. The number of time points is adapted as required by 
monitoring the CFL condition (initialized with nt = 2max(n x )). We vary and in steps of one 
order of magnitude ranging from IE—5 to IE—1, respectively. If we further reduce the regularization 
parameters the problem becomes computationally prohibitive (due to ill-conditioning) and numerically 
unstable (we not only violate the theoretical smoothness assumptions [19, 30] but also approach regimes 
that are numerically unstable; this will eventually result in irregular, nondiffeomorphic mappings); smaller 
values for the regularization parameters require finer grids to resolve the problem. We terminate the 
optimization if the relative change of the £°°-norm of the reduced gradient is at least three orders of 
magnitude. We compare the designed framework for near-incompressible registration to plain smoothness 
regularization and a model for incompressible diffeomorphisms. 

Results : Quantitative results are summarized in Tab. 2. Exemplary results for the hand images , the 
multisubject brain images and the serial brain images are illustrated in Fig. 2, Fig. 3 and Fig. 4, respectively. 


12 The images are taken from [67]. 

13 The data is available at http://central.xnat.org (open access series of imaging studies (OASIS) longitudinal study; dataset 70; 
time point one and time point four). The images have been affinely preregistered [53] and skull stripped [84]. 
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Table 2 

Quantitative analysis of registration performance as a function of the regularization parameters and f> w . The registration problem are the 
hand images, the multisubject brain images, and the serial brain images (see Fig. 1). We report results for different regularization schemes: 
We report (i) results for smoothness regularization without a constraint on the divergence of the velocity field (H 1 - and H 2 -regularization on v; 
7 = 0), (ii) for incompressible diffeomorphisms (H 1 -regularization on v; 7 = 1), and (iii) the proposed model with local adaptive compression 
(H 1 -regularization on v and w; 7 = 1). We report values for (i) the number of Hessian matrix vector products (n ma t vec ), (ii) the number of 
hyperbolic PDE solves (np de), (iii) the relative reduction of the gradient (||g*|| r eij, (iv) the relative reduction of the mismatch (||r*|| rel ), (v) min, 
max and mean values of the determinant of the deformation gradient /, and (vi) the max and mean distance of the deformation gradient from 
identity D (indicating the distance from a completely rigid mapping). The definitions of these measures can be found in Tab. 5 in §F in the 
appendix. 


run 

IMIv 

7 


fiw 

%v 

ftPDE 

llsllrel 

HU 

min(J) 

mean (/) 

max(J) 

mean(D) 

max(D) 








hand 

images 






#1 

H 2 

0 

IE—1 

n/a 

145 

328 

6.51E-4 

2.57E-1 

7.76E-1 

1.02 

1.33 

1.59E-1 

3.37E-1 

#2 



IE—2 


475 

999 

7.95E-4 

1.22E-1 

6.90E-1 

1.05 

1.73 

2.53E-1 

5.54E-1 

#3 



IE—3 


2435 

4972 

8.86E-4 

8.41E-2 

5.98E-1 

1.07 

1.99 

3.17E-1 

8.79E-1 

#4 



IE—4 


15189 

30648 

9.71E-4 

5.46E-2 

2.90E-1 

1.11 

2.45 

4.07E-1 

1.32 

#5 

H 1 

0 

IE—1 

n/a 

187 

419 

7.95E-4 

1.60E-1 

6.54E-1 

1.02 

1.64 

1.61E-1 

5.11E—1 

#6 



IE—2 


1560 

3291 

9.46E-4 

5.56E-2 

2.72E-1 

1.06 

2.61 

2.63E-1 

2.14 

#7 

H 1 

1 

IE—1 

n/a 

392 

868 

9.28E-4 

3.89E-1 

10E-1 

1.00 

1.00 

1.64E-1 

7.73E-1 

#8 



IE—2 


1011 

2116 

8.74E-4 

2.38E-1 

9.99E-1 

1.00 

1.00 

2.99E-1 

1.56 

#9 

H 1 

1 

IE—1 

IE—1 

306 

675 

8.98E-4 

2.63E-1 

9.00E-1 

1.00 

1.14 

1.47E-1 

5.78E-1 

#10 




IE—2 

273 

603 

7.75E-4 

1.95E-1 

8.00E-1 

1.01 

1.34 

1.54E-1 

4.51E-1 

#11 




IE—3 

241 

537 

8.80E-4 

1.70E-1 

7.31E-1 

1.02 

1.51 

1.59E-1 

4.55E-1 

#12 




IE—4 

212 

472 

7.89E-4 

1.62E-1 

7.00E-1 

1.02 

1.61 

1.60E-1 

4.81E-1 

#13 




IE—5 

192 

429 

8.49E-4 

1.60E-1 

6.61E-1 

1.02 

1.64 

1.61E-1 

5.04E-1 

#14 



IE—2 

IE—1 

1819 

3756 

9.96E-4 

1.54E-1 

9.44E-1 

1.00 

1.18 

2.81E-1 

1.41 

#15 




IE—2 

1466 

3025 

9.81E-4 

1.03E-1 

8.39E-1 

1.02 

1.43 

2.64E-1 

1.10 

#16 




IE—3 

1385 

2871 

9.72E-4 

8.29E-2 

6.69E-1 

1.03 

1.70 

2.51E-1 

1.05 

#17 




IE—4 

1483 

3083 

9.35E-4 

6.74E-2 

5.14E-1 

1.05 

1.84 

2.55E-1 

1.24 

#18 




IE—5 

1404 

2927 

9.23E-4 

5.87E-2 

3.44E-1 

1.05 

2.34 

2.59E-1 

1.52 


multisubject brain images 


#19 

#20 

H 2 

0 

IE—1 
IE—2 

n/a 

7538 

5685 

16223 

11855 

9.88E-4 

9.73E-4 

7.35E-1 

4.49E-1 

5.92E-1 

3.08E-1 

1.02 

1.09 

1.33 

2.68 

1.25E-1 

2.79E-1 

3.66E-1 

1.28 

#21 

H 1 

1 

IE—1 

IE—1 

3050 

6722 

9.99E-4 

7.22E-1 

6.92E-1 

1.01 

1.18 

1.47E-1 

6.75E-1 

#22 




IE—2 

3613 

7971 

9.84E-4 

4.90E-1 

4.18E-1 

1.04 

1.84 

2.05E-1 

1.11 

#23 




IE—3 

2107 

4627 

9.73E-4 

3.52E-1 

3.06E-1 

1.07 

2.15 

2.47E-1 

1.71 

#24 




IE—4 

2305 

4985 

9.79E-4 

2.95E-1 

1.67E-1 

1.09 

2.51 

2.62E-1 

1.86 

#25 




IE—5 

2703 

5858 

10E-4 

2.79E-1 

1.26E-1 

1.09 

2.72 

2.65E-1 

1.87 

serial brain images 

#26 

H 2 

0 

IE—1 

n/a 

67 

151 

7.24E-4 

6.72E-1 

9.04E-1 

1.00 

1.08 

2.07E-2 

1.18E-1 

#27 



IE—2 


169 

350 

3.99E-4 

4.59E-1 

7.39E-1 

1.00 

1.15 

3.94E-2 

3.26E-1 

#28 



IE—3 


1286 

2607 

9.10E-4 

2.98E-1 

5.14E-1 

1.01 

1.30 

7.12E-2 

9.23E-1 

#29 



IE—4 


13772 

27696 

1.25E-4 

1.87E-1 

2.62E-1 

1.01 

1.74 

1.26E-1 

1.95 

#30 

H 1 

1 

IE—1 

IE—1 

103 

227 

3.41E-4 

5.45E-1 

9.16E-1 

1.00 

1.06 

3.69E-2 

3.58E-1 

#31 




IE—2 

78 

173 

8.50E-4 

4.18E-1 

7.87E-1 

1.00 

1.13 

3.82E-2 

4.31E-1 

#32 




IE—3 

107 

235 

8.99E-4 

3.42E-1 

6.76E-1 

1.00 

1.29 

4.00E-2 

7.30E-1 

#33 




IE—4 

117 

258 

8.47E-4 

3.08E-1 

6.23E-1 

1.00 

1.69 

4.16E-2 

1.10 

#34 




IE—5 

177 

390 

8.48E-4 

2.98E-1 

6.04E-1 

1.00 

1.88 

4.26E-2 

1.29 


Observations : The most important observations are the following: Augmenting smoothness regular¬ 
ization with a constraint on V • v with a nonzero right-hand side w (mass source) allows us to control the 
magnitude of the determinant of the deformation gradient without compromising registration quality. We 
avoid over smoothing of the deformation map y. 

Enforcing incompressibility up to the numerical error is not adequate for the considered registration 
problems. This is also reflected by the residual differences reported in Tab. 2. Using a plain H 1 -seminorm 
as a regularization model (with no control on V • v) can be delicate: small variations in the regularization 
parameter yield strong variations in the determinant of the deformation gradient. The divergence 
constraint allows us to better control the mapping. The trend of the values for det(Fi) as a function of 
and f) W demonstrates that we can precisely control the regularity properties of the mapping y. 

In some cases we can—as compared to plain smoothness regularization—significantly reduce the 
variations of the determinant of the deformation gradient without compromising registration quality. For 
instance in run #16 in Tab. 2 we set f5 v to IE—2 and f$ w to IE—3 and obtain an L 2 -distance of 8.29E—2 
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Fig. 2. Exemplary registration results for the hand images (see Fig. 1). We report representative results from Tab. 2. The first three rows 
show results for plain smoothness regularization (7 = 0; first and second row: H 2 -regularization; third row: H 1 -regularization) for different 
choices of f v (top row: = IE—3; second row: f> v = IE—4; third row: f v = IE— 2). The two rows from the bottom show results for a model 

with local adaptive compression (H 1 -regularization ; 7 = 1 ) for = IE—2 and different choices for fi w (bottom row: f >2 = IE—4; second row 
from the bottom: f> w = IE—3). We show (from left to right) (i) the residual differences after registration , (ii) a map of the determinant of the 
deformation gradient (the values are reported in Tab. 2; the color map is explained in §D of the appendix), (iii) the deformed template image m\ 
with a grid in overlay, and (iv) a close up of the latter for a particular area of interest (as identified by the inset in the images). 


with det(Fi) £ [6.69E—1,1.70]. The maximum and mean distance of the deformation gradient from 
identity is 2.51E—1 and 1.05, respectively. If we want to obtain a similar residual using plain smoothness 
regularization on the basis of an H 1 -seminorm, we have to set to IE—2 (run #6 in Tab. 2). This results 
in a relative change of the L 2 -distance of 5.56E—2. However, the variation of the determinant of the 
deformation gradient is larger with det(Fi) £ [2.72E—1,2.61]. If we use an H 2 -seminorm we achieve a 
similar mismatch (relative reduction of the L 2 -distance by 8.4IE—2) for — IE—3 (run #3 in Tab. 2). The 
variation in the determinant of the deformation gradient is slightly larger with det(Fi) £ [5.98E—1,1.99] 
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Fig. 3. Exemplary registration results for the multisubject brain images (see Fig. 1). We report representative results from Tab. 2. 
We report results for plain smoothness regularization (top row; H 2 -regularization; 7 = 0; fi v = IE— 2) and for a model with local adaptive 
compression (bottom row; H 1 -regularization; 7 = 1; fi v = IE—1; fi w = IE—4). We display (from left to right) (i) a map of the residual 
differences after registration, (ii) a map of the determinant of the deformation gradient (the values are reported in Tab. 2; information about the 
color map can be found in §D of the appendix), (iii) the deformed template image m\ with a grid in overlay (to illustrate the deformation map y), 
and (iv) a close up of the latter for a particular area of interest (as identified by the inset in the images). 



Fig. 4. Exemplary registration results for the serial brain images (see Fig. 1). We compare plain smoothness regularization based on an 
H 2 -seminorm (top row; f> v = IE—3; 7 = 0) to the designed model with local adaptive compression (bottom row; H 1 -regularization; f> v = IE—1; 
fi w = IE—4; 7 = 1). We display (from left to right) (i) a map of the residual differences between the reference image mR and the deformed 
template m\, (ii) a map of the determinant of the deformation gradient (the values are reported in Tab. 2; information about the color map can be 
found in §D of the appendix; notice that we changed the window to [0.5,1.5], since the volume changes between the images are subtle), (iii) a 
deformed grid overlaid onto the deformed template image m\ (to illustrate the deformation map y) and (iv) a close up of the latter for a particular 
area of interest (as identified by the inset in the images). 


(as compared to the near-incompressible case). 

Careful visual inspection of the results in Fig. 2 confirms these findings. The residual differences 
are very similar for all models. We can also see that if we set to IE—2 or IE—4 for plain H 1 - and 
H 2 -regularization (i.e. without additional constraint on the divergence of v; runs #4 and #6 in Tab. 2), 
we seem to overfit the data; the mapping becomes more and more ill-behaved. By setting to IE—3 
for the H 2 -seminorm or by using a near-incompressible model with (5 V = IE—2 a nice diffeomorphism is 
obtained. 
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For the hand images we obtain an equivalent performance for the H 1 - and H 2 -regularization models 
since the mapping between both images is rather smooth. This is different for the multisubject brain images 
(see Fig. 3). The H 2 -seminorm yields a nice diffeomorphic map but y also appears to be overly smooth. 
That is, we observe a strong blurring in the map of the determinant of the deformation gradient. Thus, it 
is not possible to recover fine features in the deformation field. The same behavior can be observed for the 
serial brain images. Although the residual differences are very similar for the H 1 - and the H 2 -seminorm, 
we obtain mappings that are locally very different. If we use an H 1 -regularization we can recover much 
more localized features in the deformation map. These local changes could be of interest in a subsequent 
analysis of the local deformation properties (volume changes; deformation based morphometry). Also 
note that the mean values for the determinant of the deformation gradient are closer to one (i.e. volume is 
more likely preserved) as compared to plain smoothness regularization. 

When switching from H 1 - to an H 2 -regularization model we have to reduce by one order of mag¬ 
nitude to obtain a similar mismatch. Note that the computational complexity of our scheme is currently 
not mesh independent; the rate of convergence deteriorates significantly if we reduce as judged by the 
number of Hessian matrix vector products and hyperbolic PDE solves. 

Conclusions : Using an H 2 -regularization model—a common choice in large deformation diffeomor¬ 
phic registration algorithms—yields well behaved mappings. However, we might loose local features (fine 
structures) in the deformation map due to oversmoothing. These features could be of importance for a 
subsequent analysis of the deformation map. Empirically, we observed that we have to reduce by one 
order of magnitude for the H 2 -regularization model as compared to the H 1 -seminorm to obtain a similar 
mismatch. If we reduce significantly the computational work load for the inversion (with the defined 
tolerance) becomes prohibitive. If we switch to an H 1 -seminorm as a regularization model we can resolve 
fine features in the deformation. Introducing a constraint on the divergence of the velocity field with a 
nonzero mass source w allows us to explicitly control the magnitude of the determinant of the deformation 
gradient without compromising registration quality. This relaxation of the incompressibility constraint is 
critical to make our model applicable across a wide range of registration problems. 

5.1.2. Validation and Comparison. Purpose : We compare and validate registration performance of 
our algorithm. We report results for our formulation and the diffeomorphic DEMONS algorithm [90, 91]. 

Setup : Our evaluation is based on the data of the Nonrigid Image Registration Evaluation Project (NIREP) 
[21]. 14 NIREP is a standardized data repository for the validation of deformable image registration algo¬ 
rithms; we refer to [21] for details. We use the datasets naOl and na02 to study registration performance 
as a function of regularization parameters and norms (see Fig. 1). Since our implementation is only a 
two-dimensional prototype, we extract a single slice from both volumes (axial slice 128) and resample 
the data to a resolution of n x = (128,150) T (using a spline interpolation model for the image data and a 
nearest neighbor interpolation model for the label maps). For simplicity, we do not report results for the 
32 individual labels but combine them to a single gray matter label map. We report the JSC, the DSC, the 
FPE, and the FNE (see Tab. 5 in §F for the definitions). We also report values for the determinant of the 
deformation gradient. We limit the evaluation of the deformation gradient to the area occupied by the 
brain (identified by thresholding). 

We perform a parameter continuation in with bounds on the minimal tolerable determinant of 
the deformation gradient (binary search; see [60] for details) for our algorithm starting with f5 v = 1. We 
perform this binary search for fixed values of (we vary by one order of magnitude, starting with 
p w = IE—1; i.e., we perform an exhaustive search for the second regularization parameter. We do not 
perform an additional grid or scale continuation. We terminate the optimization if the relative reduction 
in the gradient is two orders of magnitude or more. 15 Once we have obtained the velocity field (i.e., we 
solved the inverse problem with an estimated, optimal combination of regularization parameters and 
f$ w ) we compute the determinant of the deformation gradient and transport the label maps (as a post¬ 
processing step). We do so at a grid size of 4 n x to be able to fully resolve the problem. Before we solve the 
transport problem we smooth the label maps using a Gaussian filter with standard deviation of 3 h x (to 
avoid Gibbs phenomena). We threshold the transported label maps at a threshold of 0.5 to obtain binary 


14 The data is available at http: //nirep. org and described in detail in [21]. 

15 We use a larger tolerance in these experiments because we did not observe any differences in the results if we turned to smaller 
tolerances. 
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labels and map them to the original resolution level by injection. We subsequently compute the overlap 
between the reference and transported template label maps. 

The publicly available DEMONS algorithm 16 does not provide any stopping conditions other than the 
number of iterations. We tested several settings for the number of iterations in combination with a varying 
number of multiresolution levels. We observed that an increase in the number of iterations does not neces¬ 
sarily improve the obtained results; as a matter of fact the results can deteriorate for certain regularization 
parameter combinations if the iterations per resolution level are increased (i.e., the algorithm diverges). 
After these initial experiments we decided to fix the number of iterations and any other settings to the 
default values suggested in the documentation of the code (three grid resolution levels with 15, 10, and 5 
iterations, respectively, with a diffeomorphic update rule based on an exponential map [91] and symmet¬ 
ric gradient forces). We study the registration accuracy of the DEMONS algorithm as a function of the two 
regularization parameters a u (smoothing for the update field; fluid-like regularization) and cr^ (smoothing 
for the deformation field; diffusive regularization). We report results for (i) the mixed case, (ii) pure diffusive 
regularization, and (iii) pure fluid-like regularization. We report results for high data fidelity and low defor¬ 
mation regularity as well as results for low data fidelity and high deformation regularity. We apply the 
obtained mapping to the label maps using a nearest neighbor interpolation model. 

For our algorithm, we use the parameters from the run above to extend our analysis to the remaining 
NIREP datasets. That is, we use the values for the regularization parameters f> v and f> w that resulted in 
the best DSC scores for the registration between naOl and na02 to register all the remaining images to 
naOl. For the DEMONS algorithm we use two parameter settings for and cr u \ one setting that results in 
the best DSC score (under the constraint that the deformation has to be diffeomorphic) and one setting 
that matches the determinant of the deformation gradient delivered by our method. 

Results: We summarize the quantitative results for the datasets naOl and na02 in Tab. 3. We pro¬ 
vide qualitative results in Fig. 5. We report the results for the remaining NIREP datasets in Tab. 4; we 
summarize this experiment in Fig. 6. 

Observations: The most important observation is that our framework allows us to generate diffeo¬ 
morphic maps that are much better behaved with a higher data fidelity, at the cost of a significant increase 
in computational work as compared to the DEMONS algorithm. 

The DEMONS algorithm is much more efficient than our approach. The time to solution is significantly 
faster than for our prototype implementation. 17 Overall, we trade numerical accuracy and convergence 
guarantees against computational complexity. The obtained maps are for most of the combinations for 
cr u and diffeomorphic as judged by the determinant of the deformation gradient. The highest diffeo¬ 
morphic DSC we could achieve for the DEMONS algorithm is 7.99E—1 for the diffusive regularization (run #5 
in Tab. 3) and 8.06E—1 for the fluid-like regularization (run #22 in Tab. 3). The min/mean/max values for 
the determinant of the deformation gradient are 2.17E—1/1.83/6.06E1 and 1.50E—1/3.67/2.63E2, respec¬ 
tively. Our formulation allows us to obtain similar (run #30 in Tab. 3) or even better values for the DSC 
(run #26, run #27, and run #31 in Tab. 3) with much more well-behaved deformation maps (as judged by 
the determinant of the deformation gradient). 

For run #30 we obtain almost the same DSC with det(Fi) E [6.46E—1,1.94]. Further, we can observe 
that across almost all runs the mean values for det(Fi) are much closer to one for the linear Stokes 
regularization case. The H 2 -regularization also results in better behaved deformation maps. However, 
the variations increase significantly as we turn to a higher data fidelity. Our approach results in maps for 
which the maximal value of the determinant of the deformation gradient is (for the most part) much better 
behaved than for the DEMONS algorithm. This is important since the Jacobian of the inverse deformation 
map will have very small values if the maximum for the reported values is large. For instance, for the best 


16 We use the public implementation found at http://hdl.handle.net/1926/510 (see [90]; compiled with ITK4). 

17 The runtime of our solver depends on the choice of the regularization norm and weight, and the complexity of the registration 
problem. Considering the test problem in Tab. 3 we obtain the following timings for a fixed regularization parameter on a Linux 
machine with Intel Xeon X5650 Westmere EP 6-core processors at 2.67GHz with 24GB DDR3-1333 memory (stopping condition: 
reduction of the gradient by two orders of magnitude): 15 minutes for f$ v = IE—2, 60 minutes for f$ v = IE—3, and 300 minutes 
for f$ v m IE—4 for an H 2 -regularization model (compressible diffeomorphism), respectively; 1 minute for f$ v = 5E—1, [$ w = IE—4, 
15 minutes for = 5E—2, f$ w = IE—4, and 75 minutes for = 5E—3, f3 w = IE—4, for the H 1 Stokes regularization scheme, 
respectively. Applying the DEMONS scheme takes only a few seconds. We note that our solver is not finalized; we are currently 
working on an improved solver in an effort to make it competitive with existing software for diffeomorphic image registration. We 
will report a detailed performance analysis for this solver elsewhere. 
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Table 3 

Performance evaluation. We compare registration quality between the diffeomorphic DEMONS algorithm (top block) and the proposed algo¬ 
rithm (bottom block; plain H 2 -smoothness regularization and linear Stokes regularization (LS)). We perform a parameter continuation in 
with lower bounds on the determinant of the deformation gradient for our approaches (the bound is 0.1 for LS and 0.5, 0.2, and 0.1 for the 
H 2 -regularization model). For the DEMONS algorithm we study registration quality as a function of the regularization parameters cr u (standard 
deviation for the smoothing of the update field) and (standard deviation for the smoothing of the deformation field); i.e., we perform an ex¬ 
haustive parameter search. We report values for the JSC, the DSC, the FPE, and the FNE. We also report min, mean, and max values for the 
determinant of the deformation gradient J. We highlight the best results for each approach in bold. We highlight results that are identified to be 
nondiffeomorphic in faint gray color. 


run j6 v ;cr u ftw'/Cd JSC DSC FPE FNE min(/) mean(/) max(J) 

DEMONS 


#1 EXP 

0. 

4.00 

5.23E-1 

6.87E-1 

3.50E-1 

2.94E-1 

7.42E-1 

1.08 

1.55 

#2 

0. 

3.00 

5.58E-1 

7.16E-1 

3.09E-1 

2.70E-1 

6.18E-1 

1.10 

2.29 

#3 

0. 

2.00 

6.00E-1 

7.50E-1 

2.46E-1 

2.52E-1 

4.22E-1 

1.19 

7.12 

#4 

0. 

1.50 

6.38E-1 

7.79E-1 

2.03E-1 

2.33E-1 

2.97E-1 

1.40 

2.36E1 

#5 

0. 

1.00 

6.65E-1 

7.99E-1 

1.74E-1 

2.20E-1 

2.17E-1 

1.83 

6.06E1 

#6 

0. 

5.00E-1 

7.33E-1 

8.46E-1 

1.20E-1 

1.78E-1 

-4.58E1 

3.76E1 

2.16E4 

#7 

5.00E-1 

4.00 

5.23E-1 

6.86E-1 

3.51E-1 

2.94E-1 

7.42E-1 

1.08 

1.55 

#8 

5.00E-1 

3.00 

5.56E-1 

7.15E-1 

3.10E-1 

2.71E-1 

6.18E-1 

1.10 

2.28 

#9 

5.00E-1 

2.00 

5.98E-1 

7.48E-1 

2.48E-1 

2.54E-1 

4.22E-1 

1.18 

7.03 

#10 

5.00E-1 

1.50 

6.34E-1 

7.76E-1 

2.05E-1 

2.37E-1 

3.02E-1 

1.39 

2.22E1 

#11 

5.00E-1 

1.00 

6.58E-1 

7.94E-1 

1.80E-1 

2.24E-1 

2.27E-1 

1.80 

5.78E1 

#12 

5.00E-1 

5.00E-1 

7.23E-1 

8.39E-1 

1.27E-1 

1.86E-1 

-1.38E2 

3.30E1 

1.27E4 

#13 

5.00E-1 

0. 

7.19E-1 

8.37E-1 

1.36E-1 

1.83E-1 

—2.37E10 

9.85E7 

8.40E11 

#14 

1.00 

4.00 

5.22E-1 

6.86E-1 

3.49E-1 

2.96E-1 

7.44E-1 

1.08 

1.54 

#15 

1.00 

3.00 

5.54E-1 

7.13E-1 

3.15E-1 

2.72E-1 

6.22E-1 

1.10 

2.22 

#16 

1.00 

2.00 

5.95E-1 

7.46E-1 

2.53E-1 

2.54E-1 

4.30E-1 

1.18 

6.32 

#17 

1.00 

1.50 

6.21E-1 

7.66E-1 

2.17E-1 

2.44E-1 

3.19E-1 

1.34 

1.67E1 

#18 

1.00 

1.00 

6.50E-1 

7.88E-1 

1.88E-1 

2.27E-1 

2.55E-1 

1.66 

4.18E1 

#19 

1.00 

5.00E-1 

7.05E-1 

8.27E-1 

1.40E-1 

1.96E-1 

-9.66 

1.36E1 

3.29E3 

#20 

1.00 

0. 

7.21E-1 

8.38E-1 

1.30E-1 

1.86E-1 

-2.17E7 

8.55E3 

1.09E7 

#21 

1.50 

0. 

7.07E-1 

8.28E-1 

1.38E-1 

1.95E-1 

-1.02E2 

7.60E1 

2.56E4 

#22 

2.00 

0. 

6.76E-1 

8.06E-1 

1.63E-1 

2.14E-1 

1.50E-1 

3.67 

2.63E2 

#23 

3.00 

0. 

5.89E-1 

7.42E-1 

2.35E-1 

2.72E-1 

2.52E-1 

1.35 

1.11E1 

#24 

4.00 

0. 

5.60E-1 

7.18E-1 

2.73E-1 

2.86E-1 

4.56E-1 

1.16 

3.64 


PROPOSED 


#25 

H 2 

4.38E-4 

— 

6.29E-1 

7.72E-1 

2.66E-1 

2.04E-1 

5.16E-1 

1.15 

3.54 

#26 


7.75E-5 

— 

6.97E-1 

8.22E-1 

2.01E-1 

1.63E-1 

2.28E-1 

1.28 

7.98 

#27 


3.25E-5 

— 

7.26E-1 

8.41E-1 

1.70E-1 

1.51E-1 

1.44E-1 

1.38 

1.32E1 

#28 

LS 

2.13E-2 

1.00E-1 

5.70E-1 

7.26E-1 

3.21E-1 

2.47E-1 

9.48E-1 

1.06 

1.16 

#29 


5.50E-3 

1.00E-2 

6.32E-1 

7.75E-1 

2.73E-1 

1.95E-1 

9.00E-1 

1.07 

1.31 

#30 


4.94E-3 

1.00E-3 

6.65E-1 

7.99E-1 

2.44E-1 

1.73E-1 

6.46E-1 

1.09 

1.94 

#31 


4.94E-3 

1.00E-4 

7.03E-1 

8.25E-1 

2.06E-1 

1.53E-1 

2.79E-1 

1.16 

4.09 


runs for DEMONS max(det(Fi)) is equal to 6.06E1 (pure diffusive regularization ; run #5 in Tab. 3) and 2.63E2 
(pure fluid-like regularization ; run #22 in Tab. 3) as compared to 4.09 for the linear Stokes case (run #31 in 
Tab. 3; which in addition to that has a better DSC). If we compare the results with similar values for the 
deformation gradient (e.g., run #3 or run #24 in Tab. 3) we cannot achieve the same DSC scores as the 
PROPOSED algorithm delivers; we have to operate the DEMONS algorithm at regimes with large variations in 
the determinant of the deformation gradient to obtain DSC scores that are equivalent to those achieved 
with our algorithm. 

If we consider the results for the remaining datasets in Tab. 4 we observe a similar behavior (for fixed 
parameters for both algorithms). We obtain slightly better DSC scores for the DEMONS algorithm (see also 
Fig. 6) at the cost of a larger variations in the determinant of the deformation gradient. If we increase the 
regularization we can reproduce similar values for the determinant of the deformation gradient but are 
not able to achieve the same data fidelity as judged by the DSC scores. Fig. 6 shows that these differences 
are on average consistent across all datasets. 

Conclusions : We have conducted a preliminary two-dimensional study of registration quality based 
on the NIREP data. All approaches deliver diffeomorphic maps with a good data fidelity. The DEMONS 
algorithm arrives at a solution significantly faster than our current prototype implementation. We note 
that our method has not been optimized for speed yet; there exist several ways to accelerate our algorithm, 
which we are currently investigating. We will report these improvements and the extension of our solver 
to three-dimensional problems elsewhere. This preliminary study suggests that our algorithm provides 
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Fig. 5. Performance evaluation. We report qualitative results for the NIREP data sets for our method (top row and middle row) and the 
DEMONS algorithm (bottom row). The displayed results correspond to the runs reported in Tab. 3 (H 2 -regularization: run #26; linear Stokes 
regularization: run #31; DEMONS: run #5 (diffusive regularization) and run #22 (fluid-like regularization); we use run run #26 instead of the best 
run (run #27) for the H 2 -regularization because it has a similar DSC as we obtain for the linear Stokes case). We report for each run (from left 
to right) (i) the residual differences after registration, (ii) a map of the determinant of the deformation gradient, and (iii) a deformed grid overlaid 
onto the deformed template image. We computed the displayed results on a finer grid (512 x 600) as compared to the one we have used to solve 
the optimization problem (128 x 150) to be able to visualize the obtained deformation maps accurately (since MATLAB uses linear interpolation 
for visualization and we are using a spectral basis). 


much more well-behaved mappings without compromising data fidelity as compared to the DEMONS algo¬ 
rithm. Overall, we trade numerical accuracy and convergence guarantees against computational efficiency 
(i.e., an increase in the time to solution). We consider these differences in registration quality a preliminary 
result; clearly, we have to validate the performance of our solver on three-dimensional data, extend the 
comparison to other algorithms, and reduce the time to solution to truly demonstrate that our formulation 
has the potential to impact the applied sciences. 

5.2. Nonlinear Stokes regularization (shear control). Purpose : We study the effect of controlling the 
shear in the deformation field in the presence of an expected "discontinuous" motion field. We compare 
results for the nonlinear Stokes regularization model to plain, quadratic smoothness regularization, and a 
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Fig. 6. Statistical quantification of registration quality across multiple datasets. We show box whisker plots to summarize the results 
reported in Tab. 4. We compare registration quality for the DEMONS algorithm and the PROPOSED algorithm. We report results for A: DEMONS, 
cr u = 0, a d = 1; B: DEMONS, a u = 2, a d = 0; C: DEMONS, a u = 0, cr d = 2; D: DEMONS, a u = 4, a d = 0; E: PROPOSED, linear Stokes 
regularization, f> v = 4.94E—3, f> w = IE—4. These parameters are the same as in Tab. 4; these parameters deliver results that are consistent with 
our formulation in terms of the DSC scores or the values for the determinant of the deformation gradient. We report the DSC scores (left) and the 
smallest (middle) and largest (right) values for the determinant of the deformation gradient. 



Fig. 7. Registration problems with an expected "discontinuous" motion field (sliding interfaces; left: sliding rectangles; right: sliding 
vent). We display (from left to right) the reference image m^ (fixed image), the template image mj (image to be registered), and a map of the 
residual differences between m^ and mj before registration (for each set of images as indicated by the inset). 


linear Stokes regularization model (incompressible diffeomorphism). 

Setup : We consider two synthetic problems for which we expect the deformation to contain large 
shear (see Fig. 7). The images have a grid size of 512 x 512. 18 We compare plain smoothness regularization 
based on an H 2 -seminorm (7 = 0) to models of incompressible flow (H 1 -regularization; 7 = 1 ). We study 
the qualitative behavior of the deformation map with respect to changes in the flow law exponent v for 
empirically chosen values for E {IE—2, IE—3}. In particular, we study shearthickening ( 1 / = 1/2) 
and shearthinning {v E {3,5}). We consider the full set of termination criteria used in [60] for this set of 
experiments with a tolerance of IE—3. No grid, scale or parameter continuation is performed. 

Results : We report exemplary results for the sliding rectangles and the sliding vent in Fig. 8, Fig. 9, and 
Fig. 10, respectively. We enforce incompressibility up to numerical accuracy. 

Observations : The most important observation is that the nonlinear Stokes regularization provides an 
adaptive control of the shear of the deformation field at the sliding interface. Setting v to a value in (0,1) 
increases the resistance to shear (shear thickening fluid). On the contrary, if we choose v > 1 we promote 
shear. The larger v the sharper the transition at the interface and the more localized the deformation. 
This confirms the theoretical statement that the model tends to a total variation regularization for v -A 00 . 
However, we can already recover sharp interfaces for small v (e.g., for v = 5; see Fig. 8 and Fig. 10 
bottom and, in particular. Fig. 9). The residual differences between the registered images are insignificant 
for varying parameters v. The computed mappings are very different; points close to the sliding interface 
map to completely different positions. We can model highly nonlinear deformations with a precise control 
on the determinant of the deformation gradient. Likewise to the linear case we can also extend this 
formulation by introducing a mass source w (see §A.2 in the appendix for details) rendering the flow 
near-incompressible (results not included in this study). This makes this approach applicable across a 
wider range of registration scenarios. 

Conclusion : The nonlinear Stokes regularization model allows us to promote or penalize large shear 
in the deformation field as required. As a consequence, we have—in contrast to traditional vectorial total 
variation—complete control on the smoothness properties of the deformation map and the determinant of 
the deformation gradient. Further, we—like in total variation regularization—do not have to identify the 


18 We use more grid points to be able to resolve the velocity field and avoid aliasing. 
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Fig. 8. Exemplary registration results for the sliding rectangles (see Fig. 7). We study the effect of shear control (nonlinear Stokes 
regularization). We compare plain H 2 -regularization (top row; 7 = 0; fi v = IE— 2) to a linear Stokes regularization model (third row from the 
top; H 1 -regularization; 7 = 1; f v = IE— 3) and a nonlinear Stokes regularization model (second row from the top: v = 1/2 (shear thickening; 
f v = IE—2); first and second row from the bottom: v G {3,5} (shear thinning; f> v = IE— 3)). We show (from left to right) (i) a map of the 
residual differences between the reference image mR and the deformed template m\, (ii) a map of the determinant of the deformation gradient, (iii) 
a deformed grid overlaid onto the deformed template image m\ (to illustrate the deformation map y), (iv) a close up of the latter for a particular 
area of interest, and (v) a single displacement vector at x = (4.66,3.25) (the location is indicated as a gray rectangle in the visualization of the 
deformed grid; the size of the box is 25 x 25 grid points). 


interfaces where the sliding motion is expected to occur (i.e. we do not require a presegmentation of the 
data). We note that promoting shear is only an approximation of true sliding motion, i.e., our formulation 
does not allow for computing "discontinuous" motion fields. In future studies we will compare our 
formulation against total variation regularization to better quantify the capabilities. 

6. Conclusions. We have introduced novel constrained regularization schemes for large deforma¬ 
tion diffeomorphic image registration that feature a local control of the divergence of the velocity and 
thus of the determinant of the deformation gradient (in a problem-dependent way). Our formulation 
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Fig. 9. Exemplary registration results for the sliding rectangles (see Fig. 7). We display the displacement field for the nonlinear Stokes 
regularization for v = 5 (bottom row in Fig. 8). We only show an detail of the displacement field in full resolution. On the left we illustrate the 
region of interest Cl. A closeup of this region of interest is provided on the right. 



Fig. 10. Exemplary registration results for the sliding vent (see Fig. 7). We study the effect of shear control for a highly nonlinear 
registration problem. We show results for a linear (top row; 7 = 1; v = 1) and a nonlinear (bottom row; 7 = 1; v = 5) Stokes regularization 
model. We show (from left to right) (i) a map of the residual differences between the reference image m^ and the deformed template m\, (ii) a map 
of the determinant of the deformation gradient, (Hi) a deformed grid overlaid onto the deformed template image m\ (to illustrate the deformation 
map y) and (iv) a close up of the latter for a particular area of interest. 


is founded on well-established computational models in fluid mechanics (Stokes flow). All results re¬ 
ported in this study are limited to the two-dimensional case. Nothing in our formulation is specific to the 
two-dimensional case; an extension to three dimensions is ongoing work in our group. 

We invert for a stationary velocity field. We achieve a similar or even better inversion quality (as 
judged by values for the residual differences and overlap between anatomical regions) as compared to 
available diffeomorphic registration models, while maintaining a better control over the deformation reg¬ 
ularity. Furthermore, several applications do require incompressible or near-incompressible deformations, 
for example, in medical imaging. Our framework provides such a technology. 

It is unclear how to theoretically determine the behavior of the proposed methods. For this reason 
we conducted experiments to probe their behavior. The basic conclusions from our experiments are the 
following: 

• Using an H 1 -seminorm as a regularization model without control of the determinant of the de¬ 
formation gradient is not robust. Either it produces uninformative maps (large regularization) or 
highly (perhaps unacceptably so) deformed maps. 

• The H 2 -seminorm without control of the determinant of the deformation gradient behaves well 
but its cost with decreasing regularization increases to regimes that make it not practical. 

• Our proposed H 1 -seminorm regularization plus control of the deformation gradient performs 
well. It delivers small mismatch values (comparable to the H 2 case without controlling det(Vy)) 
and smooth det(Vy) much faster than the H 2 scheme (e.g.. Tab. 2, run #4 versus run #16 and 
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run #20 versus run #23). It also delivers a good agreement between anatomical structures with 
an excellent control on det(Vi/) (e.g.. Tab. 3 run #29 and run #30). However, this scheme results 
in one additional regularization parameter (3 W and its calibration is expensive. This cost can be 
amortized in studies involving multiple images. 

So, the new formulation seems preferable in terms of robustness and speed. Further studies in three 
dimensions and on a larger set of images are necessary to confirm these results. 

We have also introduced a regularization model that allows us to control the shear in the deformation 
map. We can either promote (shear thinning) or penalize (shear thickening) shear in an attempt to approx¬ 
imate discontinuous motion fields or generate more well-behaved deformation maps. Our framework (/) is 
applicable to smooth and non-smooth registration problems, (ii) allows us to control the amount of shear, 
(iii) does not require a presegmentation of the data, and (iv) features a control of the determinant of the 
deformation gradient in a problem dependent way. We demonstrated that this regularization results in 
dramatically different deformation maps (see Fig. 7). 

Our solver is not finalized; its extension to three dimensions requires more work. The next steps will 
be the design of a more effective preconditioner, the implementation of a more effective scheme to solve 
the transport equations, and the application to problems that have time sequences of images. For such 
cases, a time-dependent velocity field will be necessary. 

Acknowledgments. We thank Georg Stadler and Johann Rudi for helpful discussions and sugges¬ 
tions. 

Appendix A. Variable Elimination: Derivations. 

Here, we provide the derivations of the variable elimination. We start with the linear Stokes regular¬ 
ization model (i.e., we consider the regularization operator A = — A and 7 = 1 in ( 8 )). 

A.l. Linear Stokes Regularization. The variable elimination for the linear, incompressible flow model 
(i.e.enforcing V • v = 0 up to numerical accuracy) can be found in [60]. Here, we extend on the formu¬ 
lation in [60] by introducing a mass source w into the divergence constraint. We eliminate p and w from 
the optimality system ( 8 ). This will result in an optimality system that allows us to only iterate on the 
reduced space of the control variable v. 

Applying the divergence to ( 8 f) yields V • (—^ v Av) + A p + V • b = 0. From the optimality condition 
V • v = w and the equivalence 

(18) Av = V(V • v) — V x (V x v) 
it follows that — fi v Aw + Ap = — V b. From ( 8 g) it follows that 

(19) w = -(^(-A + id)) -1 ^. 

Inserting this expression yields A(ji v (f5 w ( —A + id )) -1 + id)p = —V • b and therefore 

( 20 ) p — —(^ V (^ w (—A + id )) -1 + id) - 1 A -1 V • b. 

Inserting this expression into ( 8 f) yields the control equation 

( 21 ) -faAv -V(MM—A + id )) -1 + id ) -1 A -1 V b + b 

" -v-' 

=:JC[b] 

Note, that (21) is independent of the variables w and p. In addition, we have eliminated ( 8 e) and ( 8 g) 
from ( 8 ). We arrive at the optimality system (13). Computing second variations of the weak form of (13) 
yields (15) with the operator JC as defined in (21). If we consider an incompressible diffeomorphism 
(i.e., set w = 0 ) the control equation ( 21 ) simplifies to —^ v Av — VA -1 V • b + b = 0 [60]. Note that it 
is possible to replace the Laplacian operator with a biharmonic operator (i.e., consider an H 2 - instead 
of an H 1 -seminorm); the same arguments used above still hold. We can even use an H 3 -seminorm (the 
reduced gradient becomes a triharmonic equation) if theoretical considerations are of concern (see [19] or 
§B for a brief discussion; we provide exemplary results in Fig. 11). We limit ourselves in this work to an 
H 1 -seminorm, which results in a linear Stokes regularization model. 
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A.2. Nonlinear Stokes Regularization. We discuss the variable elimination techniques for the non¬ 
linear Stokes regularization next. We start with a model of incompressible flow (i.e., we assume that 

V • v = 0). The same arguments that have been used in the former section apply. However, since the 
viscosity is no longer a constant but a function of v, we have to decompose rj into 

t][v] =rj[v]+f}[v] = i H v ] + ^Jv[ v ] dx 

to be able to eliminate p. If we insert this decomposition into the control equation for v we obtain 
(22) —2fi v fj[v] V • £[v\ - V • ip v fj[v]£[v] + Vp + b = 0. 

The divergence of the strain rate tensor £[v] is identical to jAv under the incompressibility assumption 

V • v = 0. Accordingly, we have 


—fi v fj[v]Av — V • 2p v fj[v\£[v\ + Vp + b = 0. 

By taking advantage of (18) and applying the divergence we obtain 

-V • V • 2/3 V fj[v\£[v\ + A p + V • b = 0 

and therefore p = A -1 V • (V • 2f$ v r)[v]£[v] — b ). Inserting this expression into the control equation for v 
results in 

(23) g := -V • 2fari[v]£[v] + K%v\ = 0, 

where JC[b,v] = VA -1 V • (V • 2fi v f)[v]£[v] — b) + b. Computing second variations yields the incremental 
control equation 

(24) p v B(v)[v] + C(v)[b,v ] = -g. 

The operator B is the second variation of (6) given in (12) and 

£(v)[b,v] = VA _1 V • (V • (r)[v] +rj[v]Q[v}) £[v\ —b) + b. 

Next, we consider a nonzero mass source w (i.e., we relax the incompressibility constraint to V • v = w). 
In this case, the divergence of the strain rate tensor £[v] = ^((Vu) + (Vu) T ) is no longer proportional to 
Av. Instead, we have 


—fi v fj[v\(Av + Vw) — V • 2p v r)[v\£[v] + Vp + b = 0. 

Applying the divergence operator yields 

—2p v fj[v]Aw — V • V • 2f} v fj[v]£[v] + Ap + V • b = 0. 

Using (19) we can eliminate w. Thus, 

A(2M[p](M-A + id)) _1 +id)p - V • V • 2^[v]S[v] = -V-b 

^ ■V ' 1 

=:M 

and therefore p = A1 -1 A -1 V • (V -2 fi v f\[v]£[v] — b ). We have again found an expression for p that 
not only eliminates p but also the equation for the constraint on V • v, the control variable w, and the 
associated control equation for w. We obtain first order optimality conditions that are very similar to the 
incompressible case. The only difference is the form of the operator K in (23). In particular, 

JC[b,v] = V7W _1 A _1 V • (V • 2p v f)[v]£[v] -b) + b. 

It immediately follows that we arrive at (16) for the second variation. 
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Fig. 11. Exemplary results for a fully incompressible Stokes model (i.e., w = 0) for different regularization norms. We provide (from left to 
right) results for an H 1 -, an H 2 -, and an H 3 -seminorm for f> v = IE—2. 


Appendix B. Theoretical Considerations. 

An adequate choice for the regularization norm ensures the existence and uniqueness of a minimizer 
for J in (3a). A second requirement for an admissible solution of (3a) is that the map y generated by 
v G V is a diffeomorphism. In [30, 88 ] it is shown in the context of the large deformation diffeomorphic metric 
mapping formulation (which is related to our formulation; see [43, 60]) that y is diffeomorphic if we enforce 
a sufficient amount of smoothness on the elements of the space V. The order of the associated Sobolev 
norm depends on the dimensionality d of the ambient space; for d = 3 the norm )t(—A + id)')'i ;||2 

where 7 > 1.5, is adequate [10]. 

Our formulation operates in an incompressible or near-incompressible regime, i.e., v G {/ G H 1 (Q)^ : 
V • / = w, w G H 1 (Q)}. An excellent reference for the analysis of such flows is [26]. An existence proof 
for a minimizer of J in (3) for the nonstationary, incompressible case (i.e., w = 0 ) in two dimensions, for 
which the images are modeled as functions of bounded variation and the regularization model for v is an 
H 3 -seminorm, can be found in [19]. This proof is based on the premise that v propagates 23 V regularity in 
space (i.e., mj G 23V(fi) and m(-,t) is in 23V(Q) for all t G [0,1]); Lipschitz regularity of v in space not only 
implies that we propagate the regularity of mj to f G [0,1] but also that we obtain a unique map y from 
v [19, 26]. Under these assumptions it is shown in [19] that we have to equip J with an H 3 -seminorm 
to guarantee existence of a minimizer; this will result in a triharmonic control equation. The results 
reported in [19] were—due to numerical considerations—obtained using an H 1 -seminorm assuming that 
H 1 -regularity still yields smooth enough results. A critical result to support this claim can be found in [28] 
(see also [26]); the smoothness requirements for v are relaxed to a local Sobolev regularity; this relaxation 
will not preserve 23V regularity of the images [25]. However, existence and uniqueness of the flow y can be 
guaranteed if v has local Sobolev regularity in space and V • v is a bounded measurable function [28]; our 
incompressible formulation fulfills these requirements. According to [28], H 1 regularity of v will transport 
L 2 images to L 2 images (see also [19, 18]). Relaxing 23V-regularity to L 2 -integrability seems reasonable, 
especially since our scheme currently cannot handle 23V images; we have to use smooth representations 
of mp and mg to ensure numerical stability. An existence proof for a minimizer of an incompressible H 1 - 
flow that follows these arguments can be found in [18, pages 58ff]. A rigorous proof for our formulation 
remains open. We note that we can switch to an H 2 - or an H 3 -seminorm if our formulation does not meet 
the theoretical requirements; all the derivations and algorithmic features presented here will still apply 
(see §A; an exemplary result can be found in Fig. 11). Likewise, adding a parabolic regularization via a 
diffusion operator to the hyperbolic transport equation can be another strategy to ensure existence of a 
minimizer, even for an L 2 -integrable v [9]. 

As we mentioned in § 2 , our computational studies as well as the results reported in [19] suggest that 
an H 1 -seminorm together with a control on V • v seems to provide sufficient smoothness to converge to a 
locally optimal, diffeomorphic solution. 

Appendix C. Connection to Total Variation Regularization. 

Here we briefly comment on the connection between our nonlinear Stokes regularization model and 
a total variation regularization model. The total variation regularization model is given by 

(25) 5[u] = [ (Vv : Vv) 1//2 dx 

Jn 

Notice that the exponent in ( 6 ) will tend to 1/2 if we let v in ( 6 ) tend to 00 . Thus, if we replace the strain 
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Fig. 12. Illustration of the visualization of the computed deformation map y. From left to right: template image mj, deformed template 
image m\ (deformed configuration), deformed template image m\ with an illustration of the deformed grid as an overlay and a map of the 
determinant of the deformation gradient F i (as identified in the inset). The color map for det(Fi) is displayed on the right. Notice that the map 
y is defined in an Eulerian frame of reference (Eulerian description of motion), i.e., the map y models where points originate from. Accordingly, 
we have det(Fi) = det(Vi/ -1 ) = det(Vi/) -1 . 


rate tensor £ = \((Vv) + (Vu) T ) with Vv, (6) and (25) are equivalent as v -A oo. This equivalence is also 
reflected by the variations of both models. We obtain A[v] = —V • (Vv : Vv)~ 1 ^ 2 Vv for the first variation 
of (25) and 


B{v)[v] = -V • (Vv : Vz>)- 1/2 VS 

for the second variation of (25) with respect to v, respectively. These expressions are very similar to the 
operators in (10) and (12). As such, the derivations we presented in this work also hold if we replace (6) 
with (25). 

Appendix D. Illustration of Deformation Map. 

We report images of the deformation pattern (deformed grid) and maps of the determinant of the 
deformation gradient in order to illustrate local properties of the deformation map. Here, we provide 
information on how these were generated and on how to interpret them. 

D.l. Deformation Map. We illustrate regularity and local properties of the deformation map y on the 
basis of deformed grids. We define yasa perturbation from identity, i.e. y := x — u\, where u\ : Q -A R d , 
u\ := u(-,t = 1), u : Q x [0,1] -A R d f is some displacement field at final time t = 1. The latter can be 
computed from the velocity field v by solving 

(26) dtu + (Vu)v = v in Q x (0,1], u = 0 in Q x {0}, 

with periodic boundary conditions on 9Q. Note that y is defined in an Eulerian frame of reference (i.e., 
the deformed illustrates not where the points move to but where they originate from). An exemplary 
visualization of a synthetic deformation is shown in Fig. 12. 

D.2. Deformation Gradient. We report maps and values for the determinant of the deformation gra¬ 
dient to qualitatively and quantitatively assess regularity of a mapping y. In the framework of continuum 
mechanics the deformation tensor field F : Q x [0,1] -A R dxd can be computed from v by solving 

(27) d t F + (v • V)F = (Vv)F in Q x (0,1], F = I in fi x {0}, 

with periodic boundary conditions on 9Q. Here, I = diag(l,.. .,1) E R dxd and det(Fi) is identical to 
det(Vy) -1 , where F\ := F(-,f = 1), F\ : O —> R dxd f and y is the Eulerian deformation map. 

We limit the color map for the display of det(Fi) to [0,2]. In particular, the color map ranges from 
black (compression: det(Fi) E (0,1); black corresponds to values of 0 or below (due to clipping), which 
represents a singularity or the loss of mass, respectively) to orange (mass conservation: det(Fi) = 1) to 
white (expansion: det(Fi) > 1; white represents values of 2 or greater (due to clipping)). An illustration of 
this color map can be found in Fig. 12. Notice that none of the maps for the determinant of the deformation 
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gradient reported in this study values smaller than 0 (i.e., we do not report any degenerate deformation 
maps). 

Appendix E. Algorithm. 

Here, we provide more insight into our algorithm. We have added this information to the appendix, 
since we are mainly concerned with new regularization schemes in the present work. We refer to [60] 
for a detailed study of our globalized, inexact, preconditioned, reduced space (Gauss-)Newton-Krylov 
method for constrained diffeomorphic image registration; the study in [60] includes a comparison to a 
preconditioned gradient descent scheme. Note that our solver is not finalized. We are currently working 
on improvements to reduce the time to solution. 

E.l. The Reduced Space Newton-Krylov Method. We have seen in §3.1 that the first order opti¬ 
mality conditions for (3) are a system of space-time multicomponent nonlinear PDEs for the transported 
intensities m, the velocity field v, and the mass source w. As we have seen in §3.3 we can significantly 
simplify this system by exploiting variable elimination techniques to obtain (13); our algorithm will only 
operate on the reduced system. Note that this elimination not only reduces the computational complex¬ 
ity but also allows us to fulfill the hard constraint on V • v exactly. Given that we use a pseudospectral 
discretization in space, we can efficiently evaluate the resulting differential operators and their inverses. 

We apply a Newton linearization to solve the first order optimality system (8). This linearization re¬ 
sults in a huge, severely ill-conditioned, dense, multi-component system for the incremental state, adjoint, 
and control variables. In full space methods one directly solves this system. Our algorithm belongs to the 
class of reduced space methods (see e.g. [11, 12] for details); we do not solve for all unknowns of the KKT 
system simultaneously but eliminate the incremental state and adjoint variables from the system; we only 
iterate on the reduced space of the control variable v. Advantages of reduced space methods include 
better spectral properties of the reduced space Hessian, a similar structure of the forward and adjoint 
operators, and a reduction of the order of the KKT system (the reduced space Hessian is nothing but 
the Schur complement for v) to a size that is manageable [11]. Nonetheless, solving this system remains 
a significant challenge, given that the reduced space Hessian is still a large, ill-conditioned, dense, and 
compact operator. 19 Further, as we will see below, we have to solve the state and adjoint equations at each 
iteration—a direct consequence of the block elimination in reduced space methods. 

Next, we will discuss how the conceptual idea of a reduced space Newton-Krylov method relates to 
the optimality systems we have presented in §3. The reduced space KKT system is given by (15e). To 
solve this system, we have to evaluate to what we refer to as the reduced gradient g on the right-hand side 
of (15e). This involves the solution of the system (13) in sequential order: Given some v we first solve (13a) 
forward in time. This gives us m at t = 1, which we need for the terminal condition in (13d). 20 Next, we 
solve (13c) backward in time. Note that (13c) represents a transport equation for the mismatch between 
m\ and m r. Thus, A will (ideally) tend to zero as we approach a solution (local minimizer) of (3). After 
we have solved (13a) and (13c), we can evaluate the control equation (13e) for some trial v. Note that m 
and A in (13e) are essentially functions of v through (13a) and (13c), respectively. Thus, although the PDE 
constraints are linear in v, the inverse problem is not; it is non-linear in v. This non-linearity as well as 
the conditioning of our problem are the main reasons why we prefer second order optimization methods. 

Now, that we have found g, we can solve the reduced space KKT system in (15e) for the incremental 
control variable v, i.e., compute the update (search direction) for v; (15e) only provides the action of the 
reduced-space Hessian Ti on v (Hessian matvec); this is all we need, given that we use a PCG method to 
solve (15e) (i.e., our solver is matrix free). The incremental state and adjoint variables in and A are—like 
we have seen for m and A for the first order optimality conditions—functions of v through (15a) and (15c), 
respectively. Thus, each time we apply Ti to v, we have to solve (15a) and (15c). Also note that (15) not 
only depends on the incremental variables m, A, and v but also on m. A, and v; the systems are strongly 
coupled. 

To compute a minimizer to (3) we have to repeat this entire process several times until convergence. 
We refer to the steps for updating v as outer iterations and the steps for iteratively solving the reduced 


19 A study of the spectral properties of the reduced space Hessian for compressible and incompressible diffeomorphisms can be 
found in [60]. 

20 We assign the costs for this forward solve to the evaluation of the objective and not the gradient. 
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space KKT system as inner iterations. We summarize these steps in compact form in Alg. 1 and Alg. 2 (for 
a PCG method), respectively. 

The number of outer iterations depends on the rate of convergence of our scheme. A convergence 
study of our (Gauss-)Newton-Krylov scheme can be found in [60]. For a Newton-Krylov method we 
expect this convergence to be quadratic. However, given that we can not guarantee that the Hessian is 
positive definite far away from a (local) minimizer, we resort to a Gauss-Newton approximation. This is 
equivalent to dropping all expressions in which A appears in (15); we expect the rate of convergence to 
drop from quadratic to super-linear (see [60] for more details). The number of inner iterations depends 
on the tolerance of the PCG method and on the preconditioner for the KKT system. For the tolerance we 
follow standard numerical optimization literature [27, 31] and define it to be proportional to the relative 
£ 2 -norm of the reduced gradient (see [60] for details). The preconditioner is what we describe next. 


Algorithm 1 Outer iteration of the designed inexact Newton-Krylov method. 


1 

Vq a- 0; compute mj, Ag, J h (V q) and g [j; k <— 0 


2 

while true do 


3 

stop A- check for convergence 


4 

if stop break 


5 

v\ A- solve (17) given m^, Ajf, and gfe 

> Newton step (see Alg. 2) 

6 

Oik A- perform line search on v subject to the Armijo condition 


7 

4+ 1 <- 4 + 


8 

m *+i( t = 0 ) m T 


9 

A- solve (13a) forward in time given 

> forward solve 

10 

A fc +1 (t = 1) <- ( m R - m k+ i( f = !)) 


11 

A^ +1 A- solve (13c) backward in time given v^ +1 and 

> adjoint solve 

12 

compute J h {vl +1 ) and g\ +l given m h k+v \ h k+1 and v h k+l 


13 

k^k + 1 


14 

end while 



Algorithm 2 Newton step. We illustrate the solution of the reduced KKT system (17) using a PCG method 
at a given outer iteration k E N. The steps to compute the Hessian matrix vector product are given in 
lines 4-8. 


1 

Vk «— min(0.5,(||^|| 2 /||gg|| 2 ) 1,/2 ) 


2 

»o <-0, r 0 i - g h k , z 0 <r- (A h )~^ro, s 0 <- z 0 , l <- 0 


3 

while l < n do 


4 

3* 

II 

o 

T 

o 


5 

mf -S— solve (15a) forward in time given w[', v k and £*? 

> inc. forward solve 

6 

Af(f = 1) <- -m\ (f = 1) 


7 

A? f- solve (15c) backward in time given \ k , v k and uj 1 

> inc. adjoint solve 

8 

§; «— apply K 1 ! to s; as indicated in (15e) given Aj', A*, m k and m? 


9 

k/ «- ( ri,zi}/(si,§i ) 


10 

®f+i i- + KjS, 


11 

ri+i <~rt- K t §i 


12 

if Ik/+lll2 < Vk break 


13 

z/+l l- (^) _1 r/ + i 


14 

m <- ( z ;+i/ f /+i)/ ( z u r i) 


15 

S/+1 t- Z/+1 + Jt/S; 


16 

l^l + l 


17 

end while 



E.2. Preconditioning the Reduced Space KKT System. Preconditioning (17) is essential to provide 
an efficient solver. We consider a left preconditioner P based on the second variation of the quadratic 
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Table 5 

Summary of the 'performance measures. We report values for the relative change of the gradient (row 1) and the residual (row 2) to indicate 
inversion accuracy. Here, g[ j is the initial gradient and g ] f is the gradient at the final iteration k *; m h R is the reference and m\ the template image. 
We also report measures derived from deformation gradient F\ (row 3 and row 4). Where applicable, we also report overlap measures (rows 5-8) 
for "ground truth" segmentations of anatomical structures. Here, Lr denotes a label map for the reference image m R and Lj the corresponding 
label map for the template image m\; # denotes the cardinality of a set, \ denotes the complement, and U and Pi are the union and intersection of 
two sets, respectively. 


Description 

Symbol 

Definition 


relative change: reduced gradient 

IISIU 

llsMIi/IISoll! 


relative change: residual 

Mid 

\\ m R — m l 111 / II m jR - 

*411! 

distance of deformation gradient from identity 

D 

l|F?-I ||f 


determinant of the deformation gradient 

J 

det(Fj) 


overlap: Jaccard similarity coefficient 

JSC 

#(L R DL T )/#(L R U 

Lt) 

overlap: Dice similarity coefficient 

DSC 

2#(Lr n Lt)/(#Lr + 

#L t ) 

overlap: false positive error 

FPE 

#(Lt\L r )/#L t 


overlap: false negative error 

FNE 

#(L r \L t )/#L r 



regularization models that act on v; that is P := A h . 21 This is a common choice in the PDE constrained 
optimization community. This preconditioner has—for our numerical scheme—essentially no construction 
and application cost; the inversion and application of A h amounts to a spectral diagonal scaling. The 
system PCG sees is a compact perturbation of the identity: 22 

(28) (fivA )- 1 (p v A[v] + C[b\(v)) =v + (p v A)-*C[b\(v) = (id +(fi v A)~ l C$])v; 

the inverse of the operator ft A acts like as smoother on £[b]. 

Since the second variation of (6) can not directly be inverted in Fourier space (due to the complicated 
structure and the spatially varying viscosity; see (12)) we use the inverse of the vectorial Laplacian operator 
as a preconditioner in case the regularization model in (6) is considered. 

Appendix F. Performance Measures. 

We report different measures of registration performance. We summarize the definitions of these 
measures in Tab. 5. The inversion accuracy of our solver is controlled on the basis of a tolerance for the 
relative change of the reduced gradient. The quality of the inversion is assessed in terms of the relative 
change of the L 2 -distance (residual) between the images to be registered. For some of our experiments 
we report values that measure the agreement between label maps of anatomical structures. We quantify 
regularity of the deformation map y on the basis of measures computed from the deformation gradient 
(see §D.2). In particular, we report values for the determinant of the deformation gradient. These indicate 
local volume change and deformation regularity. We also report values for the distance of the deformation 
gradient from identity. 
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